Generating Keyword Lists Related to Topics Represented by an Array of Topic Records, for Use in Targeting Online Advertisements and Other Uses

ABSTRACT

A computer-implemented method of generating, from an array of topic records, an output array of keywords for use in targeting online advertisements related to topics represented by the array of topic records, includes obtaining the array of topic records in computer-readable form, determining a relevancy value for words and topics, classifying the topic vectors in the plurality of topic vectors into a high-volume class or a low-volume class, generating and storing an array of seed keywords derived by sampling from an embedded space, ranking the seed keywords to form an array of ranked keywords, updating the array of ranked keywords based on keyword cost-per-click data, iterating the ranking and updating at least once, and evaluating an optimization improvement value for an iteration.

FIELD OF THE INVENTION

The present disclosure generally relates to computer systems that process semantic data. The disclosure relates more particularly to apparatus and techniques for processing data related to topic records wherein topic records represent meaning to users and outputting data relating to weighted, filtered, and/or sorted lists of keywords usable as inputs to an online advertising system.

BACKGROUND

Online advertising can be a useful method of advertising, if the right advertisements reach the right consumers or potential consumers. With rapidly changing interests, many different products/services to offer, and competing advertisers, achieving favorable results from an advertising campaign can be difficult. In a simple implementation, a marketing campaign manager inputs desired keywords, considers bids and offers to place advertisements based on those keywords and then places advertisements.

Selection of right and relevant keywords for specific business domain is an important task when it comes to boosting the performance of an advertising campaign, especially when the costs are a function of the number of viewers who react to an advertisement, such as clicking on a displayed advertisement (the pay-per-click model). Many marketing specialists base their keyword selection on the exploration of large keyword databases, which can be a time-consuming process. One drawback to that approach is that it requires considerable human intervention and thus is often only available to larger organizations.

SUMMARY

A semantic processor is programmed to evaluate a topic data structure using a topic module, keyword module, and the optimization module in a process of keyword selection for keywords to be used as an input to a keyword-based online advertising purchasing computer system, as well as other purposes. In some embodiments, the semantic processor can derive keyword sets in a fully automated way without human intervention and without need for time-consuming processes of exploring large keyword databases.

A computer-implemented method of generating, from an array of topic records, an output array of keywords for use in targeting online advertisements related to topics represented by the array of topic records, includes obtaining the array of topic records in computer-readable form, determining a relevancy value for words and topics, classifying the topic vectors in the plurality of topic vectors into a high-volume class or a low-volume class, generating and storing an array of seed keywords derived by sampling from an embedded space, ranking the seed keywords to form an array of ranked keywords, updating the array of ranked keywords based on keyword cost-per-click data, iterating the ranking and updating at least once, evaluating an optimization improvement value for an iteration, and, when either the optimization improvement value for the iteration is below a pre-determined threshold or the maximum time limit for the optimization is reached, generating the output array of keywords from the array of ranked keywords.

Classifying the topic vectors in the plurality of topic vectors into a high-volume class might comprise a paragraph vector model step, and classifying the topic vectors in the plurality of topic vectors into a low-volume class might comprise a PMI-SVD model step. Obtaining the array of topic records in computer-readable form might comprise sending user interface data representing a user interface to a user device, obtaining a user reply from the user device, and generating the array of topic records from the user reply. Determining the relevancy value for the word and topic might comprise computing a cosine distance between the word vector and the topic vector.

These operations might be performed by executing executable instructions stored in a non-transitory computer-readable storage medium that, when executed by one or more processors of a computer system, cause the computer system to perform those operations.

The following detailed description together with the accompanying drawings will provide a better understanding of the nature and advantages of the present invention.

BRIEF DESCRIPTION OF THE DRAWINGS

Various embodiments in accordance with the present disclosure will be described with reference to the drawings, in which:

FIG. 1 illustrates an overview of how a semantic processor might be used to generate a keyword set from a topics list, including a topic module, a ranker module and a keyword suggester module.

FIG. 2 illustrates internal logic of the topic module of FIG. 1.

FIG. 3 illustrates a paragraph vector model.

FIG. 4 illustrates an example of a process that a semantic processor might use to implement a selection module.

FIG. 5 illustrates an example of a keyword list data structure.

FIG. 6 is a block diagram of various components, including inputs from human users and computer processes.

FIG. 7 is a block diagram that illustrates a computer system upon which an embodiment of the invention may be implemented.

FIG. 8 is a block diagram of an example of memory structures as might be used to implement functions described herein.

DETAILED DESCRIPTION

In the following description, various embodiments will be described. For purposes of explanation, specific configurations and details are set forth in order to provide a thorough understanding of the embodiments. However, it will also be apparent to one skilled in the art that the embodiments may be practiced without the specific details. Furthermore, well-known features may be omitted or simplified in order not to obscure the embodiment being described.

According to one embodiment, the techniques described herein are implemented by one or more generalized computing systems programmed to perform the techniques pursuant to program instructions in firmware, memory, other storage, or a combination. Special-purpose computing devices may be used, such as desktop computer systems, portable computer systems, handheld devices, networking devices or any other device that incorporates hard-wired and/or program logic to implement the techniques.

In this description, an embodiment might have data structures for input, output, or processing, such as data structures for topics, keywords, campaigns and other data structures to represent data being handled. Some of these data structures might be storable as simple strings, or as arrays of strings or other data elements.

A topic data element can represent concepts that can be expressed by user input. For example, topics might relate to interests, descriptions of business purposes, general themes, products, locations, times of day, types of business, keywords, and other expressions. Topics might be used to describe a target customer group or a product group of an enterprise.

An array of topic records could be used in a storable and readable representation, digital or otherwise, representing a plurality of topic data elements. Arrays of topic records might be mutable and can be inputs or outputs of different processes that are performed as described herein.

An embodiment might also have keyword data structures, wherein a keyword data structure represents some standardized representation of a targeting scope of a campaign, expressed in a natural or in another language. These representations include, but are not limited to, actual keywords (e.g., keywords that form inputs to the Google advertising platform or other keyword-based advertising platform where advertising is purchased or presented based on selected keywords), a selection of user interests on the Facebook advertising platform (or other social media platforms that filter or present content based on data structures representing user interests), an encoding of a distribution of targets in terms of single users or user groups, and embeddings in a learned or predefined targeting space.

Arrays of keyword records might be stored as readable data representing keyword entities and might be mutable and inputs or outputs of different processes that are performed as described herein.

Users of the systems described herein might be humans that are interacting with the systems, such as business owners who make use of the system to generate campaigns, and potential customers who are targeted by those campaigns in an online setting, as well as those maintaining and managing such systems. A marketer might be a human who interacts with a backend portion of the system. These interactions might include manual optimization of different campaign parameters, making decisions about future steps in the campaign's lifecycle, and adding knowledge into the system to inform future decisions. A campaign data structure is a representation of data that is in part generated and in part based on user input that describes a targeted online advertising process. As an example, campaign data might describe aspects of a targeted online advertising process over a lifetime of a campaign. A campaign might encompass different parameters such as topics, keywords, budget allocations, and visual elements. A campaign might be mutable over its lifetime and might be optimized before, during, and/or after by adjusting the parameters of the campaign. Campaign data structures can be compared to detect similarities between campaigns and inform subsequent decisions.

FIG. 1 illustrates how a semantic processor might derive keyword sets from a given topic list. The topic list is obtained from a user or other source and formed into a topic list data structure. The topic list data structure is passed to the topic module. A task of the topic module is to retrieve the most relevant and semantically closest words for the inputted list. The retrieved list will likely represent the most relevant words that revolve around the passed topics. In this process, these words are called “seed keywords.” Topics may refer to categories more generally, such as an area of business of a user. For example, for a baker, the category of interest might be “I am a baker” or the like.

These seed keywords are passed to a selection module, which can be an iterative process for deriving new keyword ideas and selecting the most relevant ones for the inputted topic list. For that purpose, the selection module uses two parts: a ranker and a keyword suggester. The keyword suggester is responsible for deriving new keyword ideas, and the ranker can be seen as a procedure that ranks and filters the ideas that are coming from the keyword suggester. The selection module then iterates between ranking and deriving new keyword ideas until the keyword set reaches a desired performance threshold. Each module is described below.

A keyword selection process might start with seed keywords, which represent core products or services an entity might want to advertise in their pay-per-click (PPC) campaigns. A main task of the topic module is to derive relevant seed keywords from the inputted topic list so that the selection module can narrow them down to niche-specific keywords. For these purposes, the topics and seed keywords can be modeled as vector representations for determining the relevancy of every seed keyword inside each topic and retrieving the most relevant ones for the selection module. The relevancy of each word inside a given topic might be determined by a cosine distance, computed as in Equation 1.

$\begin{matrix} {{\cos \left( {t,w} \right)} = \frac{t \cdot w}{{\left\| t \right\|} \cdot {\left\| w \right\|}}} & \left( {{Eqn}.\mspace{11mu} 1} \right) \end{matrix}$

In Equation 1, w represents a word vector and t represents a topic vector. The words with the lowest cosine distance to a given set of topics or a single topic are used as seed keywords for the selection module. In order to achieve quality embeddings, the semantic processor might classify each topic by the amount of training data available for it and use one of two embedding methods accordingly, with one method for low volume topics and another method for high volume topics.
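As an illustration only, the seed keyword retrieval described above might be sketched as follows. Equation 1 as written gives the cosine of the angle between t and w; this sketch assumes the distance is taken as one minus that value, so that a smaller distance means a more relevant word. The function names and the toy vectors are hypothetical and are not part of the described system.

```python
import math

def cosine(t, w):
    """Equation 1: dot product of t and w divided by the product of their norms."""
    dot = sum(a * b for a, b in zip(t, w))
    norm_t = math.sqrt(sum(a * a for a in t))
    norm_w = math.sqrt(sum(b * b for b in w))
    return dot / (norm_t * norm_w)

def cosine_distance(t, w):
    # Assumption: distance = 1 - cos(t, w); smaller means more relevant.
    return 1.0 - cosine(t, w)

def seed_keywords(topic_vector, word_vectors, k):
    """Return the k words whose vectors are closest to the topic vector."""
    ranked = sorted(
        word_vectors,
        key=lambda word: cosine_distance(topic_vector, word_vectors[word]),
    )
    return ranked[:k]

# Toy, made-up vectors for illustration only.
topic = [1.0, 0.0]
words = {"bread": [0.9, 0.1], "pastry": [0.7, 0.3], "tractor": [0.0, 1.0]}
print(seed_keywords(topic, words, k=2))
```

With these toy vectors, the two words pointing most nearly in the direction of the topic vector are returned first.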

For low volume topics, the semantic processor uses a PMI-SVD method and for high volume topics, the semantic processor uses a Paragraph Vector model method.

The topic module can be constructed from simple handcrafted heuristic processes: based on the amount of data available for a topic, the semantic processor selects appropriate embeddings for deriving seed keywords.

The word vectors might be obtained by training the semantic processor with corpuses of text from real world language and/or keyword sets from previous campaigns. The training might be distinct for distinct topic categories. The mapping could be from an initial category to all possible mappings that could be embedded using natural language processing, such as interests, keywords, behaviors, etc.

In turn, seed keywords might be derived by sampling from an embedded space based on selected topics, sampling the best keyword candidates from the space. The embedded space might be obtained by training vector models with large corpuses of natural language data and from previous keyword sets of campaigns.

Data for the update criteria might be keyword cost per click data and the search volume of keywords. These metrics can be obtained from third-party sources that track this data over time. This could be modeled as a mapping of an input array of generic topic records to both keywords and interests.

Filters might be applied, such as target customer age ranges, customer location, and/or customer language.

FIG. 2 illustrates internal logic of the topic module of FIG. 1, including a pipeline of the topic module. The topic module first classifies topics as low volume or high volume topics. Based on their classification, different types of embeddings are used. The semantic processor can be programmed to implement the various computations represented by the equations herein.
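A minimal sketch of such a classification step is shown below, assuming the heuristic is a simple threshold on the number of training documents per topic. The threshold value, the document counts, and the names used here are hypothetical; the description does not specify them.

```python
def classify_topics(doc_counts, threshold):
    """Label each topic 'high' or 'low' volume by its amount of training data."""
    classes = {}
    for topic, count in doc_counts.items():
        classes[topic] = "high" if count >= threshold else "low"
    return classes

# Mapping from class to embedding method, following the description.
EMBEDDING_METHOD = {"low": "PMI-SVD", "high": "Paragraph Vector"}

counts = {"gardening": 120000, "greek cuisine": 800}  # made-up document counts
classes = classify_topics(counts, threshold=10000)    # threshold is an assumption
for topic, cls in classes.items():
    print(topic, cls, EMBEDDING_METHOD[cls])
```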

The PMI-SVD method relies on a co-occurrence-based word association measure, point-wise mutual information (PMI), and a factorization technique, singular value decomposition (SVD). First, the semantic processor creates a shared vocabulary, V_(shared), as shown in Equation 2.

$\begin{matrix} {V_{shared} = {\bigcup_{i = 1}^{N}V_{{topic}_{i}}}} & \left( {{Eqn}.\mspace{11mu} 2} \right) \end{matrix}$

In Equation 2, N denotes the number of topics and V_(topic-i) denotes the vocabulary of the i-th topic. Each individual topic vocabulary is created by taking top T words ordered by their frequency.
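The construction of Equation 2 might be sketched as follows, assuming each topic's data set is already available as a list of tokens. The function names and the toy corpora are hypothetical.

```python
from collections import Counter

def topic_vocabulary(tokens, top_t):
    """Top-T most frequent words of one topic's token list."""
    return {word for word, _ in Counter(tokens).most_common(top_t)}

def shared_vocabulary(topic_tokens, top_t):
    """Equation 2: union of the per-topic vocabularies."""
    shared = set()
    for tokens in topic_tokens.values():
        shared |= topic_vocabulary(tokens, top_t)
    return shared

# Toy, made-up corpora for illustration only.
corpora = {
    "baking": "bread flour bread oven flour bread".split(),
    "gardening": "soil seed soil plant soil seed".split(),
}
print(sorted(shared_vocabulary(corpora, top_t=2)))
```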

Topic data sets are tokenized with vocabulary V_(shared). Tokenized data sets are used in the semantic processor's PMI-SVD process to compose PMI matrices for each and every low volume topic, resulting in P={P^(topic1), . . . , P^(topicn)} PMI matrices. Each entry in a PMI matrix represents a PMI value of word x and word y calculated from context window c of topic “i.”

The PMI value can be calculated as in Equation 3. The semantic processor might reduce the dimensionality of each PMI matrix with SVD in order to yield seed keyword vectors of size 100. Topic vectors are an average of all seed keyword vectors of a topic, calculated as in Equation 4.

$\begin{matrix} {{PMI}_{x,y}^{{topic}^{i}} = {\log \frac{p\left( {x,y} \right)}{{p(x)}{p(y)}}}} & \left( {{Eqn}.\mspace{11mu} 3} \right) \\ {{topic}_{i} = {\frac{1}{\left| P^{{topic}^{i}} \right|}{\sum_{j = 1}^{\left| P^{{topic}^{i}} \right|}{seed}_{{keyword}_{j}}}}} & \left( {{Eqn}.\mspace{11mu} 4} \right) \end{matrix}$
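The PMI and topic vector computations might be sketched as below. The way the probabilities are estimated (relative frequencies of words and of co-occurring pairs inside a forward window) is an assumption, since the description does not fix an estimator, and the SVD reduction step is deliberately left out of this sketch. All names and the toy tokens are hypothetical.

```python
import math
from collections import Counter

def pmi_matrix(tokens, window):
    """Equation 3: PMI for word pairs co-occurring within `window` positions."""
    word_counts = Counter(tokens)
    pair_counts = Counter()
    for i, x in enumerate(tokens):
        for y in tokens[i + 1 : i + 1 + window]:
            pair_counts[(x, y)] += 1  # count each pair in both orders
            pair_counts[(y, x)] += 1
    total_words = sum(word_counts.values())
    total_pairs = sum(pair_counts.values())
    pmi = {}
    for (x, y), n_xy in pair_counts.items():
        p_xy = n_xy / total_pairs            # assumed estimate of p(x, y)
        p_x = word_counts[x] / total_words   # assumed estimate of p(x)
        p_y = word_counts[y] / total_words   # assumed estimate of p(y)
        pmi[(x, y)] = math.log(p_xy / (p_x * p_y))
    return pmi

def topic_vector(seed_vectors):
    """Equation 4: element-wise mean of a topic's seed keyword vectors."""
    n = len(seed_vectors)
    return [sum(column) / n for column in zip(*seed_vectors)]

pmi = pmi_matrix("bread oven bread flour".split(), window=1)
print(topic_vector([[1.0, 2.0], [3.0, 4.0]]))
```

In a fuller pipeline, the rows of the PMI matrix would be reduced with SVD before averaging, as the description states.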

In the Distributed Paragraph Vector model, the semantic processor extends the word vector model, referred to herein as “CBOW,” by adding another input to the model. That additional input is a randomly initialized vector that represents the document/topic/interest the target word is part of. In that sense, the additional vector can be seen as additional memory that captures all essential information needed to represent properties of the topic/interest/document.

Formally, the semantic processor could predict a next word in a sequence with a softmax multiclass classifier as in Equation 5, where y_(wt) is an un-normalized log-probability of output word w_(t), and each of y_(i) is an un-normalized log-probability for each output word i.

$\begin{matrix} {{P\left( {\left. w_{t} \middle| w_{t - k} \right.,\ldots \;,w_{t + k}} \right)} = \frac{e^{y_{wt}}}{\Sigma_{i}e^{y_{i}}}} & \left( {{Eqn}.\mspace{11mu} 5} \right) \end{matrix}$

The semantic processor can compute un-normalized log-probabilities as in Equation 6, where b stands for bias, U is a weight matrix, and h is as shown in Equation 7, where w₁, . . . , w_(n) are word vectors from an embedding matrix W, t_(i) is a topic embedding from an embedding matrix T, and h denotes concatenation or averaging of the word vectors and the topic vector.

y=b+Uh  (Eqn. 6)

h=[w_(1); . . . ; w_(n); t_(i)]  (Eqn. 7)
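A forward pass through Equations 5 to 7 might be sketched as follows, using the concatenation variant of h. The dimensions, weights, and names are made up for illustration; in a trained model, U, b, and the embeddings would be learned.

```python
import math

def hidden(word_vectors, topic_vec):
    """Equation 7: concatenate the context word vectors and the topic vector."""
    h = []
    for w in word_vectors:
        h.extend(w)
    h.extend(topic_vec)
    return h

def logits(U, b, h):
    """Equation 6: y = b + U h, one un-normalized log-probability per output word."""
    return [b_i + sum(u * x for u, x in zip(row, h)) for row, b_i in zip(U, b)]

def softmax(y):
    """Equation 5: normalize logits into probabilities (max subtracted for stability)."""
    m = max(y)
    exps = [math.exp(v - m) for v in y]
    total = sum(exps)
    return [e / total for e in exps]

# Toy values: two 1-dimensional context words, a 1-dimensional topic, 3 output words.
h = hidden([[0.5], [-0.2]], [1.0])
U = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
b = [0.0, 0.0, 0.0]
probs = softmax(logits(U, b, h))
print(probs)
```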

The Distributed Paragraph Vector model can share the same objective function with that of a CBOW model. Formally, given a sequence of words word₁, word₂, . . . , word_(n), the objective of the model is to maximize an average log probability as in Equation 8, wherein c stands for word context window size.

$\begin{matrix} {\frac{1}{n}{\sum_{i = 1}^{n}{\log \; {p\left( {\left. {word}_{i} \middle| {word}_{i - c} \right.,{\ldots ,{word}_{i - 1}},{word}_{i + 1},\ldots \;,{word}_{i + c}} \right)}}}} & \left( {{Eqn}.\mspace{11mu} 8} \right) \end{matrix}$
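The objective of Equation 8 might be evaluated as in the sketch below, which averages the log probability of each word given the words within c positions on either side. Truncating the context at the ends of the sequence is an assumption, and `uniform_prob` is a hypothetical stand-in for the model's conditional probability.

```python
import math

def average_log_probability(words, c, prob):
    """Equation 8: mean over positions i of log p(word_i | context within c of i)."""
    total = 0.0
    for i, target in enumerate(words):
        left = words[max(0, i - c) : i]     # up to c words before position i
        right = words[i + 1 : i + 1 + c]    # up to c words after position i
        total += math.log(prob(target, left + right))
    return total / len(words)

# Hypothetical stand-in model: uniform probability over a 4-word vocabulary.
def uniform_prob(target, context):
    return 0.25

sentence = ["fresh", "bread", "daily", "baked"]
print(average_log_probability(sentence, c=2, prob=uniform_prob))
```

Training would adjust the model parameters to increase this value over the training sequences.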

An overview of a Distributed Paragraph Vector model can be seen in FIG. 3. The vocabulary V_(topic) can be constructed for all high volume topics and reused as vocabulary V_(shared). Embedding matrices W and T are constructed for their corresponding vocabularies, where W corresponds to a seed keywords vocabulary and T corresponds to a topic vocabulary.

The rest of the architecture can be constructed according to the above-mentioned equations. For data gathering purposes, a custom heuristic can be used that operates on a Common Crawl Corpus. Documents gathered from a Common Crawl process might be automatically annotated with appropriate topics tags so that the semantic processor can learn topic vectors.

In one specific implementation, documents are clustered inside 39 topics, ranging from Greek cuisine to gardening. For the training, 100 dimensions for topic and word vectors were used, with a window size of four words and learning rate of 0.025. Optimization of the model can be done by SGD.

The selection module is an iterative procedure of ranking and generating new keyword ideas. An overview of a selection module process is shown in FIG. 4. Two main parts of the selection module are the ranker and the keyword suggester. The ranker might score keyword sets by a predetermined performance metric and might also filter non-relevant keywords. In the semantic processor, the ranker can be implemented in a way that the keyword set score is computed by summing individual keyword scores from the set.

A keyword set score is illustrated by Equation 10. The final keyword set should contain keywords that have a search volume that can lead to user conversion and a low cost-per-click, so that the set boosts the effectiveness of a campaign. A keyword performance metric might be a value computed from the reciprocal of a keyword cost-per-click value multiplied by a normalized keyword search volume value. The result can be used as a keyword score for a computer process that evaluates keywords. With the keyword score, the semantic processor can rule out keywords that have a high cost-per-click value, because taking the reciprocal of the keyword cost per click favors low-cost and moderate-cost keywords with sufficient search volume. The keyword score might be generated as in Equation 9, where d is a distance function, w_(keyword) is a keyword vector and t_(initial) is a topic vector of interest. In other variations, different distance functions might be used. The distance function might be cosine distance, Euclidean distance, or some natural language processing distance function.

$\begin{matrix} {{Keyword}_{score} = {\frac{1}{{Keyword}_{cpc}}*{Keyword}_{NSV}*\frac{1}{d\left( {w_{keyword},t_{initial}} \right)}}} & \left( {{Eqn}.\mspace{11mu} 9} \right) \\ {\mspace{79mu} {{KeywordSet}_{score} = {\sum_{i = 1}^{N}{Keyword}_{score}^{i}}}} & \left( {{Eqn}.\mspace{11mu} 10} \right) \end{matrix}$

The array of keywords might be updated by calculating a keyword score for the array. The keyword score might be calculated as a normalized search volume divided by a keyword cost per click. The distance term indicates how related a selected keyword is to the original topic. A high distance would penalize the keyword score, since the keyword is not related at all to the initial topics, and a small distance would mean that the keyword is highly related to the original topic and it is therefore beneficial to include in the original set.
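Equations 9 and 10 might be computed as in the following sketch. The record layout and the numeric figures are made up for illustration, and the sketch assumes the cost per click and the distance are both strictly positive so that the reciprocals are defined.

```python
def keyword_score(cpc, normalized_search_volume, distance):
    """Equation 9: (1 / cpc) * NSV * (1 / d(w_keyword, t_initial))."""
    return (1.0 / cpc) * normalized_search_volume * (1.0 / distance)

def keyword_set_score(keywords):
    """Equation 10: sum of the individual keyword scores in the set."""
    return sum(keyword_score(k["cpc"], k["nsv"], k["distance"]) for k in keywords)

# Made-up figures for illustration only.
keywords = [
    {"text": "sourdough bakery", "cpc": 0.5, "nsv": 0.8, "distance": 0.2},
    {"text": "wedding cakes", "cpc": 2.0, "nsv": 0.6, "distance": 0.5},
]
print(keyword_set_score(keywords))
```

In this toy set, the cheaper and closer keyword contributes most of the set score, which matches the intent of penalizing high cost per click and high distance.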

The keyword suggester might be programmed to process a heuristic that operates on large keyword databases and retrieves keyword suggestions based on the defined input. The semantic processor might use a third-party service, such as the Google Targeting Idea service, as the keyword suggester module. That service allows the semantic processor to retrieve targeting keyword ideas from various parameters, such as keyword list, location, language, product category and others.

The selection module iterates between the ranker and the keyword suggester until there is no further improvement in the keyword set or a desired number of optimization rounds is reached. The final set from the iteration is output in a form that can be used in PPC campaign targeting settings and other uses.
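The iteration between the ranker and the keyword suggester might be sketched as below. The `suggest` and `score` callables are hypothetical stand-ins (for example, for a third-party suggestion service and for a per-keyword score), and the stopping parameters `min_gain`, `max_rounds`, and `max_seconds` are assumed names for the improvement threshold, the round limit, and a time limit.

```python
import time

def select_keywords(seeds, suggest, score, keep, min_gain, max_rounds, max_seconds):
    """Alternate between suggesting and ranking until a stopping rule fires."""
    start = time.monotonic()
    ranked = sorted(set(seeds), key=score, reverse=True)[:keep]
    best = sum(score(k) for k in ranked)
    for _ in range(max_rounds):
        if time.monotonic() - start > max_seconds:
            break  # time limit reached
        candidates = set(ranked) | set(suggest(ranked))
        new_ranked = sorted(candidates, key=score, reverse=True)[:keep]
        new_total = sum(score(k) for k in new_ranked)
        if new_total - best < min_gain:
            break  # improvement below threshold
        ranked, best = new_ranked, new_total
    return ranked

# Toy stand-ins with made-up scores, for illustration only.
scores = {"bread": 1.0, "bakery": 3.0, "cake": 2.0, "oven": 0.5}
result = select_keywords(
    seeds=["bread", "oven"],
    suggest=lambda current: ["bakery", "cake"],
    score=lambda k: scores[k],
    keep=2, min_gain=0.01, max_rounds=10, max_seconds=5.0,
)
print(result)
```

Here the first round replaces both seeds with higher-scoring suggestions, and the second round stops because the set score no longer improves.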

FIG. 5 illustrates an example of a keyword list data structure.

FIG. 6 is a block diagram of various components, including inputs from human users and computer processes. As shown there, an advertisement designer 606 might provide user input to a visual ad generator 604. The visual ad generator 604 could generate campaign data 610 that can be passed to a campaign storage and management system 612. The visual ad generator 604 can also pull in campaign data previously stored.

A marketing reviewer 616 can provide feedback to a keyword learning system 614 that can get feedback also from an A/B testing system 618. A budget reviewer 620 can submit a budget plan to a budget allocator system 622 that interacts with an adjustment triggering system 624, which can in turn provide feedback to the visual ad generator 604.

FIG. 7 is a block diagram that illustrates a computer system 700 upon which an embodiment of the invention may be implemented. Computer system 700 includes a bus 702 or other communication mechanism for communicating information, and a processor 704 coupled with bus 702 for processing information. Processor 704 may be, for example, a general purpose microprocessor.

Computer system 700 also includes a main memory 706, such as a random access memory (RAM) or other dynamic storage device, coupled to bus 702 for storing information and instructions to be executed by processor 704. Main memory 706 also may be used for storing temporary variables or other intermediate information during execution of instructions to be executed by processor 704. Such instructions, when stored in non-transitory storage media accessible to processor 704, render computer system 700 into a special-purpose machine that is customized to perform the operations specified in the instructions.

Computer system 700 further includes a read only memory (ROM) 708 or other static storage device coupled to bus 702 for storing static information and instructions for processor 704. A storage device 710, such as a magnetic disk or optical disk, is provided and coupled to bus 702 for storing information and instructions.

Computer system 700 may be coupled via bus 702 to a display 712, such as a computer monitor, for displaying information to a computer user. An input device 714, including alphanumeric and other keys, is coupled to bus 702 for communicating information and command selections to processor 704. Another type of user input device is cursor control 716, such as a mouse, a trackball, or cursor direction keys for communicating direction information and command selections to processor 704 and for controlling cursor movement on display 712. This input device typically has two degrees of freedom in two axes, a first axis (e.g., x) and a second axis (e.g., y), that allows the device to specify positions in a plane.

Computer system 700 may implement the techniques described herein using customized hard-wired logic, one or more ASICs or FPGAs, firmware and/or program logic which in combination with the computer system causes or programs computer system 700 to be a special-purpose machine. According to one embodiment, the techniques herein are performed by computer system 700 in response to processor 704 executing one or more sequences of one or more instructions contained in main memory 706. Such instructions may be read into main memory 706 from another storage medium, such as storage device 710. Execution of the sequences of instructions contained in main memory 706 causes processor 704 to perform the process steps described herein. In alternative embodiments, hard-wired circuitry may be used in place of or in combination with software instructions.

The term “storage media” as used herein refers to any non-transitory media that store data and/or instructions that cause a machine to operate in a specific fashion. Such storage media may comprise non-volatile media and/or volatile media. Non-volatile media includes, for example, optical or magnetic disks, such as storage device 710. Volatile media includes dynamic memory, such as main memory 706. Common forms of storage media include, for example, a floppy disk, a flexible disk, hard disk, solid state drive, magnetic tape, or any other magnetic data storage medium, a CD-ROM, any other optical data storage medium, any physical medium with patterns of holes, a RAM, a PROM, an EPROM, a FLASH-EPROM, NVRAM, any other memory chip or cartridge.

Storage media is distinct from but may be used in conjunction with transmission media. Transmission media participates in transferring information between storage media. For example, transmission media includes coaxial cables, copper wire and fiber optics, including the wires that comprise bus 702. Transmission media can also take the form of acoustic or light waves, such as those generated during radio-wave and infra-red data communications.

Various forms of media may be involved in carrying one or more sequences of one or more instructions to processor 704 for execution. For example, the instructions may initially be carried on a magnetic disk or solid state drive of a remote computer. The remote computer can load the instructions into its dynamic memory and send the instructions over a network connection. A modem or network interface local to computer system 700 can receive the data. Bus 702 carries the data to main memory 706, from which processor 704 retrieves and executes the instructions. The instructions received by main memory 706 may optionally be stored on storage device 710 either before or after execution by processor 704.

Computer system 700 also includes a communication interface 718 coupled to bus 702. Communication interface 718 provides a two-way data communication coupling to a network link 720 that is connected to a local network 722. For example, communication interface 718 may be an integrated services digital network (ISDN) card, cable modem, satellite modem, or a modem to provide a data communication connection to a corresponding type of telephone line. Wireless links may also be implemented. In any such implementation, communication interface 718 sends and receives electrical, electromagnetic or optical signals that carry digital data streams representing various types of information.

Network link 720 typically provides data communication through one or more networks to other data devices. For example, network link 720 may provide a connection through local network 722 to a host computer 724 or to data equipment operated by an Internet Service Provider (ISP) 726. ISP 726 in turn provides data communication services through the world wide packet data communication network now commonly referred to as the “Internet” 728. Local network 722 and Internet 728 both use electrical, electromagnetic or optical signals that carry digital data streams. The signals through the various networks and the signals on network link 720 and through communication interface 718, which carry the digital data to and from computer system 700, are example forms of transmission media.

Computer system 700 can send messages and receive data, including program code, through the network(s), network link 720 and communication interface 718. In the Internet example, a server 730 might transmit a requested code for an application program through Internet 728, ISP 726, local network 722 and communication interface 718. The received code may be executed by processor 704 as it is received, and/or stored in storage device 710, or other non-volatile storage for later execution.

FIG. 8 illustrates an example of memory elements that might be used by a processor to implement elements of the embodiments described herein. For example, where a functional block is referenced, it might be implemented as program code stored in memory. FIG. 8 is a simplified functional block diagram of a storage device 848 having an application that can be accessed and executed by a processor in a computer system. The application can be one or more of the applications described herein, running on servers, clients or other platforms or devices and might represent memory of one of the clients and/or servers illustrated elsewhere. Storage device 848 can be one or more memory devices that can be accessed by a processor and storage device 848 can have stored thereon application code 850 that can be configured to store one or more processor readable instructions. The application code 850 can include application logic 852, library functions 854, and file I/O functions 856 associated with the application.

Storage device 848 can also include application variables 862 that can include one or more storage locations configured to receive input variables 864. The application variables 862 can include variables that are generated by the application or otherwise local to the application. The application variables 862 can be generated, for example, from data retrieved from an external source, such as a user or an external device or application. The processor can execute the application code 850 to generate the application variables 862 provided to storage device 848.

One or more memory locations can be configured to store device data 866. Device data 866 can include data that is sourced by an external source, such as a user or an external device. Device data 866 can include, for example, records being passed between servers prior to being transmitted or after being received. Other data 868 might also be supplied.

Storage device 848 can also include a log file 880 having one or more storage locations 884 configured to store results of the application or inputs provided to the application. For example, the log file 880 can be configured to store a history of actions.

The memory elements of FIG. 8 might be used for a server or computer that interfaces with a user, generates keyword lists, and/or manages other aspects of a process described herein.

Operations of processes described herein can be performed in any suitable order unless otherwise indicated herein or otherwise clearly contradicted by context. Processes described herein (or variations and/or combinations thereof) may be performed under the control of one or more computer systems configured with executable instructions and may be implemented as code (e.g., executable instructions, one or more computer programs or one or more applications) executing collectively on one or more processors, by hardware or combinations thereof. The code may be stored on a computer-readable storage medium, for example, in the form of a computer program comprising a plurality of instructions executable by one or more processors. The computer-readable storage medium may be non-transitory.

Conjunctive language, such as phrases of the form “at least one of A, B, and C,” or “at least one of A, B and C,” unless specifically stated otherwise or otherwise clearly contradicted by context, is understood in context as used in general to present that an item, term, etc., may be either A or B or C, or any nonempty subset of the set of A and B and C. For instance, in the illustrative example of a set having three members, the conjunctive phrases “at least one of A, B, and C” and “at least one of A, B and C” refer to any of the following sets: {A}, {B}, {C}, {A, B}, {A, C}, {B, C}, {A, B, C}. Thus, such conjunctive language is not generally intended to imply that certain embodiments require at least one of A, at least one of B and at least one of C each to be present.
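
The enumeration above can be checked mechanically. The following is a minimal illustrative sketch, not part of any claimed embodiment; the helper name `nonempty_subsets` is hypothetical and is introduced here only to list the nonempty subsets of a three-member set.

```python
from itertools import combinations


def nonempty_subsets(items):
    """Return every nonempty subset of items, smallest subsets first."""
    return [
        set(combo)
        for size in range(1, len(items) + 1)
        for combo in combinations(items, size)
    ]


# For a set having three members there are 2**3 - 1 nonempty subsets.
subsets = nonempty_subsets(["A", "B", "C"])
for subset in subsets:
    print(sorted(subset))
```

Each printed subset corresponds to one of the sets that the conjunctive phrase can refer to; the empty set is excluded.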

The use of any and all examples, or exemplary language (e.g., “such as”) provided herein, is intended merely to better illuminate embodiments of the invention and does not pose a limitation on the scope of the invention unless otherwise claimed. No language in the specification should be construed as indicating any non-claimed element as essential to the practice of the invention.

In the foregoing specification, embodiments of the invention have been described with reference to numerous specific details that may vary from implementation to implementation. The specification and drawings are, accordingly, to be regarded in an illustrative rather than a restrictive sense. The sole and exclusive indicator of the scope of the invention, and what is intended by the applicants to be the scope of the invention, is the literal and equivalent scope of the set of claims that issue from this application, in the specific form in which such claims issue, including any subsequent correction.

Further embodiments can be envisioned by one of ordinary skill in the art after reading this disclosure. In other embodiments, combinations or sub-combinations of the above-disclosed invention can be advantageously made. The example arrangements of components are shown for purposes of illustration and it should be understood that combinations, additions, re-arrangements, and the like are contemplated in alternative embodiments of the present invention. Thus, while the invention has been described with respect to exemplary embodiments, one skilled in the art will recognize that numerous modifications are possible.

For example, the processes described herein may be implemented using hardware components, software components, and/or any combination thereof. The specification and drawings are, accordingly, to be regarded in an illustrative rather than a restrictive sense. It will, however, be evident that various modifications and changes may be made thereunto without departing from the broader spirit and scope of the invention as set forth in the claims and that the invention is intended to cover all modifications and equivalents within the scope of the following claims.

All references, including publications, patent applications, and patents, cited herein are hereby incorporated by reference to the same extent as if each reference were individually and specifically indicated to be incorporated by reference and were set forth in its entirety herein. 

What is claimed is:
 1. A computer-implemented method of generating, from an array of topic records, an output array of keywords for use in targeting online advertisements related to topics represented by the array of topic records, the method comprising: obtaining the array of topic records in computer-readable form; storing the array of topic records as an array of topic vectors, wherein topic vectors encode for topics; obtaining data that is used to represent word vectors and storing the data as an array of word vectors; determining, for each word vector of a plurality of word vectors in the array of word vectors and topic vector of a plurality of topic vectors in the array of topic vectors, a relevancy value for the word and topic; classifying the topic vectors in the plurality of topic vectors into a high-volume class or a low-volume class; generating and storing an array of seed keywords derived by sampling from an embedded space; ranking the seed keywords to form an array of ranked keywords; updating the array of ranked keywords based on a keyword score that can be computed from measurable metrics; iterating the ranking and updating at least once; evaluating an optimization improvement value for an iteration; and when either the optimization improvement value for the iteration is below a pre-determined threshold or the maximum time limit for the optimization is reached, generating the output array of keywords from the array of ranked keywords.
 2. The computer-implemented method of claim 1, wherein classifying the topic vectors in the plurality of topic vectors into a high-volume class comprises a paragraph vector model step and classifying the topic vectors in the plurality of topic vectors into a low-volume class comprises a PMI-SVD model step.
 3. The computer-implemented method of claim 1, wherein obtaining the array of topic records in computer-readable form comprises: sending user interface data representing a user interface to a user device; obtaining a user reply from the user device; and generating the array of topic records from the user reply.
 4. The computer-implemented method of claim 1, wherein determining the relevancy value for the word and topic comprises computing a cosine distance between the word vector and the topic vector.
 5. A non-transitory computer-readable storage medium having stored thereon executable instructions that, when executed by one or more processors of a computer system, cause the computer system to at least: perform the operations described in claim 1.
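
By way of illustration only, and without limiting the claims, the cosine distance recited in claim 4 and the stopping condition recited in claim 1 might be sketched as follows. This is a minimal sketch under stated assumptions: the function names, the toy two-dimensional vectors, the rule that drops the last-ranked keyword, the objective function, and the threshold and time-limit values are all hypothetical and do not come from the specification.

```python
import math
import time


def cosine_distance(u, v):
    """Cosine distance (1 - cosine similarity) between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 1.0  # assumption: a zero vector is treated as unrelated
    return 1.0 - dot / (norm_u * norm_v)


def optimize(keywords, rank_step, update_step, objective, threshold, max_seconds):
    """Rank and update at least once; stop when the improvement in the
    objective falls below the threshold or the time limit is reached."""
    start = time.monotonic()
    previous = objective(keywords)
    while True:  # the loop body always runs at least once
        keywords = update_step(rank_step(keywords))
        current = objective(keywords)
        improvement = current - previous
        previous = current
        out_of_time = time.monotonic() - start >= max_seconds
        if improvement < threshold or out_of_time:
            return keywords


# Hypothetical toy data: one topic vector and three word vectors.
topic_vector = [1.0, 0.0]
word_vectors = {"shoes": [1.0, 0.0], "boots": [0.8, 0.6], "pizza": [0.0, 1.0]}


def distance(keyword):
    return cosine_distance(word_vectors[keyword], topic_vector)


def rank_step(keywords):
    # Most relevant first, i.e. smallest cosine distance to the topic.
    return sorted(keywords, key=distance)


def update_step(ranked):
    # Hypothetical update rule: drop the last-ranked keyword while more than two remain.
    return ranked[:-1] if len(ranked) > 2 else ranked


def objective(keywords):
    # Hypothetical objective: negated mean distance, so larger is better.
    return -sum(distance(k) for k in keywords) / len(keywords)


output_keywords = optimize(
    ["pizza", "boots", "shoes"], rank_step, update_step, objective,
    threshold=0.01, max_seconds=1.0,
)
print(output_keywords)
```

In this sketch the ranking, update, and objective steps are passed in as functions, so a keyword score computed from measurable metrics could be substituted for the toy rules without changing the stopping logic.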